Overview

Dataset statistics

Number of variables: 17
Number of observations: 54881
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 7.1 MiB
Average record size in memory: 136.0 B
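
As a rough consistency check, the two memory figures can be related to the number of variables and observations. The sketch below assumes that every column stores 8 bytes per value (for example int64 or float64). The report does not state the column dtypes, so this is an assumption, and index overhead is ignored.

```python
# Rough consistency check of the memory figures in the dataset statistics.
# Assumption (not stated in the report): every column uses 8 bytes per value.
N_VARIABLES = 17
N_OBSERVATIONS = 54881
BYTES_PER_VALUE = 8  # assumed, e.g. int64 / float64

record_bytes = N_VARIABLES * BYTES_PER_VALUE   # bytes per row
total_bytes = record_bytes * N_OBSERVATIONS    # bytes for the whole table
total_mib = total_bytes / 2**20                # convert bytes to MiB

print(f"record size: {record_bytes} B")
print(f"total size:  {total_mib:.1f} MiB")
```

Under that assumption the result agrees with the reported 136.0 B per record and 7.1 MiB in total.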

Variable types

Numeric: 9
Categorical: 8

Alerts

df_index is highly correlated with id (High correlation)
id is highly correlated with df_index (High correlation)
weight is highly correlated with bmi and 1 other field (High correlation)
ap_hi is highly correlated with ap_lo and 2 other fields (High correlation)
ap_lo is highly correlated with ap_hi and 1 other field (High correlation)
bmi is highly correlated with weight and 1 other field (High correlation)
bmi_class is highly correlated with weight and 1 other field (High correlation)
bp_class is highly correlated with ap_hi and 1 other field (High correlation)
gender is highly correlated with smoke (High correlation)
height is highly correlated with gender (High correlation)
cholesterol is highly correlated with gluc (High correlation)
gluc is highly correlated with cholesterol (High correlation)
smoke is highly correlated with gender and 1 other field (High correlation)
alco is highly correlated with smoke (High correlation)
cardio is highly correlated with ap_hi (High correlation)
df_index is uniformly distributed (Uniform)
id is uniformly distributed (Uniform)
df_index has unique values (Unique)
id has unique values (Unique)
bmi_class has 3863 (7.0%) zeros (Zeros)

Reproduction

Analysis started: 2022-12-09 10:19:34.585624
Analysis finished: 2022-12-09 10:19:56.401299
Duration: 21.82 seconds
Software version: pandas-profiling v3.4.0
Download configuration: config.json

Variables

df_index
Real number (ℝ≥0)

HIGH CORRELATION
UNIFORM
UNIQUE

Distinct: 54881
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 34991.71881
Minimum: 0
Maximum: 69999
Zeros: 1
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 428.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 3512
Q1: 17472
Median: 34954
Q3: 52537
95-th percentile: 66522
Maximum: 69999
Range: 69999
Interquartile range (IQR): 35065

Descriptive statistics

Standard deviation: 20217.69519
Coefficient of variation (CV): 0.5777851412
Kurtosis: -1.201013232
Mean: 34991.71881
Median Absolute Deviation (MAD): 17531
Skewness: 0.002483918404
Sum: 1920380520
Variance: 408755198.9
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
47339 | 1 | < 0.1%
29190 | 1 | < 0.1%
66473 | 1 | < 0.1%
5892 | 1 | < 0.1%
58823 | 1 | < 0.1%
65067 | 1 | < 0.1%
49192 | 1 | < 0.1%
61559 | 1 | < 0.1%
21587 | 1 | < 0.1%
62428 | 1 | < 0.1%
Other values (54871) | 54871 | > 99.9%

Smallest values

Value | Count | Frequency (%)
0 | 1 | < 0.1%
1 | 1 | < 0.1%
2 | 1 | < 0.1%
3 | 1 | < 0.1%
5 | 1 | < 0.1%
6 | 1 | < 0.1%
7 | 1 | < 0.1%
8 | 1 | < 0.1%
9 | 1 | < 0.1%
10 | 1 | < 0.1%

Largest values

Value | Count | Frequency (%)
69999 | 1 | < 0.1%
69998 | 1 | < 0.1%
69997 | 1 | < 0.1%
69996 | 1 | < 0.1%
69994 | 1 | < 0.1%
69993 | 1 | < 0.1%
69992 | 1 | < 0.1%
69991 | 1 | < 0.1%
69990 | 1 | < 0.1%
69989 | 1 | < 0.1%
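
Several of the df_index figures are derived from others, which makes them easy to cross-check. The sketch below recomputes the range, IQR, coefficient of variation, and variance from the reported values. It also compares the standard deviation with that of a discrete uniform variable on 0..69999. That comparison is only an approximation, since the 54881 observed values are a subset of that interval.

```python
import math

# Values copied from the df_index statistics in this report.
minimum, maximum = 0, 69999
q1, q3 = 17472, 52537
mean = 34991.71881
std = 20217.69519

value_range = maximum - minimum   # Range
iqr = q3 - q1                     # Interquartile range (IQR)
cv = std / mean                   # Coefficient of variation (CV)
variance = std ** 2               # Variance

# Standard deviation of a discrete uniform variable on 0..N is
# sqrt(((N + 1)**2 - 1) / 12). Illustrative comparison only.
uniform_std = math.sqrt(((maximum + 1) ** 2 - 1) / 12)

print(value_range, iqr, round(cv, 6), round(variance, 1), round(uniform_std, 1))
```

The recomputed values agree with the report to rounding, and the standard deviation is within about 0.1% of the uniform reference, which fits the Uniform alert.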

id
Real number (ℝ≥0)

HIGH CORRELATION
UNIFORM
UNIQUE

Distinct: 54881
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 49961.32993
Minimum: 0
Maximum: 99999
Zeros: 1
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 428.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 4976
Q1: 24965
Median: 49928
Q3: 74943
95-th percentile: 94969
Maximum: 99999
Range: 99999
Interquartile range (IQR): 49978

Descriptive statistics

Standard deviation: 28865.97933
Coefficient of variation (CV): 0.5777664319
Kurtosis: -1.199381969
Mean: 49961.32993
Median Absolute Deviation (MAD): 24987
Skewness: 0.001210067816
Sum: 2741927748
Variance: 833244762.6
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
67617 | 1 | < 0.1%
41703 | 1 | < 0.1%
94898 | 1 | < 0.1%
8366 | 1 | < 0.1%
83956 | 1 | < 0.1%
92870 | 1 | < 0.1%
70227 | 1 | < 0.1%
87891 | 1 | < 0.1%
30836 | 1 | < 0.1%
89148 | 1 | < 0.1%
Other values (54871) | 54871 | > 99.9%

Smallest values

Value | Count | Frequency (%)
0 | 1 | < 0.1%
1 | 1 | < 0.1%
2 | 1 | < 0.1%
3 | 1 | < 0.1%
8 | 1 | < 0.1%
9 | 1 | < 0.1%
12 | 1 | < 0.1%
13 | 1 | < 0.1%
14 | 1 | < 0.1%
15 | 1 | < 0.1%

Largest values

Value | Count | Frequency (%)
99999 | 1 | < 0.1%
99998 | 1 | < 0.1%
99996 | 1 | < 0.1%
99995 | 1 | < 0.1%
99992 | 1 | < 0.1%
99991 | 1 | < 0.1%
99990 | 1 | < 0.1%
99988 | 1 | < 0.1%
99986 | 1 | < 0.1%
99985 | 1 | < 0.1%

age
Real number (ℝ≥0)

Distinct: 28
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 52.81468997
Minimum: 29
Maximum: 64
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 428.9 KiB

Quantile statistics

Minimum: 29
5-th percentile: 41
Q1: 48
Median: 53
Q3: 58
95-th percentile: 63
Maximum: 64
Range: 35
Interquartile range (IQR): 10

Descriptive statistics

Standard deviation: 6.775996528
Coefficient of variation (CV): 0.1282975728
Kurtosis: -0.8208755654
Mean: 52.81468997
Median Absolute Deviation (MAD): 5
Skewness: -0.3020382886
Sum: 2898523
Variance: 45.91412895
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=28)

Common values

Value | Count | Frequency (%)
55 | 3091 | 5.6%
53 | 3049 | 5.6%
57 | 2836 | 5.2%
59 | 2829 | 5.2%
54 | 2820 | 5.1%
56 | 2819 | 5.1%
49 | 2696 | 4.9%
58 | 2644 | 4.8%
51 | 2610 | 4.8%
52 | 2599 | 4.7%
Other values (18) | 26888 | 49.0%

Smallest values

Value | Count | Frequency (%)
29 | 3 | < 0.1%
30 | 1 | < 0.1%
39 | 1412 | 2.6%
40 | 1300 | 2.4%
41 | 1510 | 2.8%
42 | 1094 | 2.0%
43 | 1593 | 2.9%
44 | 1195 | 2.2%
45 | 1630 | 3.0%
46 | 1285 | 2.3%

Largest values

Value | Count | Frequency (%)
64 | 1723 | 3.1%
63 | 2132 | 3.9%
62 | 1736 | 3.2%
61 | 2099 | 3.8%
60 | 2496 | 4.5%
59 | 2829 | 5.2%
58 | 2644 | 4.8%
57 | 2836 | 5.2%
56 | 2819 | 5.1%
55 | 3091 | 5.6%

gender
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 428.9 KiB
1: 35711
2: 19170

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 54881
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 2
3rd row: 2
4th row: 2
5th row: 1

Common Values

Value | Count | Frequency (%)
1 | 35711 | 65.1%
2 | 19170 | 34.9%

Length

Histogram of lengths of the category

Category Frequency Plot

Value | Count | Frequency (%)
1 | 35711 | 65.1%
2 | 19170 | 34.9%

Most occurring characters

Value | Count | Frequency (%)
1 | 35711 | 65.1%
2 | 19170 | 34.9%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 54881 | 100.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
1 | 35711 | 65.1%
2 | 19170 | 34.9%

Most occurring scripts

Value | Count | Frequency (%)
Common | 54881 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
1 | 35711 | 65.1%
2 | 19170 | 34.9%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 54881 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
1 | 35711 | 65.1%
2 | 19170 | 34.9%
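
The "Decimal Number" category in the tables above corresponds to the Unicode general category Nd. As a minimal sketch, Python's standard unicodedata module can reproduce that classification for the five sample rows listed for gender. Storing the values as strings is an assumption made for illustration. The module does not expose script or block properties, so those tables are not reproduced here.

```python
import unicodedata
from collections import Counter

# The five sample rows shown for gender, held as strings (an assumption
# made for illustration; the report does not state the stored dtype).
values = ["1", "2", "2", "2", "1"]

# Count the Unicode general category of every character in every value.
# "Nd" is the code for Decimal Number.
categories = Counter(
    unicodedata.category(ch) for value in values for ch in value
)

# Length statistics analogous to the "Length" block of the report.
lengths = [len(value) for value in values]

print(categories)
print(min(lengths), max(lengths))
```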

height
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 84
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 164.3915927
Minimum: 100
Maximum: 250
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 428.9 KiB

Quantile statistics

Minimum: 100
5-th percentile: 152
Q1: 159
Median: 165
Q3: 170
95-th percentile: 178
Maximum: 250
Range: 150
Interquartile range (IQR): 11

Descriptive statistics

Standard deviation: 7.96864217
Coefficient of variation (CV): 0.04847353833
Kurtosis: 1.526689503
Mean: 164.3915927
Median Absolute Deviation (MAD): 5
Skewness: -0.05029660799
Sum: 9021975
Variance: 63.49925803
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
165 | 4581 | 8.3%
160 | 3990 | 7.3%
170 | 3674 | 6.7%
168 | 3436 | 6.3%
164 | 2672 | 4.9%
158 | 2567 | 4.7%
162 | 2564 | 4.7%
169 | 2200 | 4.0%
156 | 2196 | 4.0%
163 | 2028 | 3.7%
Other values (74) | 24973 | 45.5%

Smallest values

Value | Count | Frequency (%)
100 | 2 | < 0.1%
104 | 2 | < 0.1%
105 | 2 | < 0.1%
108 | 1 | < 0.1%
109 | 1 | < 0.1%
110 | 6 | < 0.1%
111 | 1 | < 0.1%
112 | 1 | < 0.1%
113 | 1 | < 0.1%
117 | 2 | < 0.1%

Largest values

Value | Count | Frequency (%)
250 | 1 | < 0.1%
207 | 1 | < 0.1%
198 | 10 | < 0.1%
197 | 4 | < 0.1%
196 | 6 | < 0.1%
195 | 5 | < 0.1%
194 | 1 | < 0.1%
193 | 6 | < 0.1%
192 | 9 | < 0.1%
191 | 7 | < 0.1%

weight
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 240
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 74.12450538
Minimum: 40
Maximum: 200
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 428.9 KiB

Quantile statistics

Minimum: 40
5-th percentile: 55
Q1: 65
Median: 72
Q3: 82
95-th percentile: 100
Maximum: 200
Range: 160
Interquartile range (IQR): 17

Descriptive statistics

Standard deviation: 14.25051388
Coefficient of variation (CV): 0.1922510485
Kurtosis: 2.559568857
Mean: 74.12450538
Median Absolute Deviation (MAD): 8
Skewness: 1.018748821
Sum: 4068026.98
Variance: 203.0771459
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
65 | 3037 | 5.5%
70 | 2988 | 5.4%
68 | 2209 | 4.0%
60 | 2152 | 3.9%
75 | 2114 | 3.9%
80 | 2103 | 3.8%
72 | 1783 | 3.2%
69 | 1695 | 3.1%
78 | 1639 | 3.0%
62 | 1471 | 2.7%
Other values (230) | 33690 | 61.4%

Smallest values

Value | Count | Frequency (%)
40 | 31 | 0.1%
41 | 30 | 0.1%
42 | 34 | 0.1%
42.2 | 1 | < 0.1%
43 | 50 | 0.1%
44 | 50 | 0.1%
45 | 100 | 0.2%
46 | 76 | 0.1%
47 | 89 | 0.2%
48 | 133 | 0.2%

Largest values

Value | Count | Frequency (%)
200 | 2 | < 0.1%
180 | 4 | < 0.1%
178 | 2 | < 0.1%
172 | 1 | < 0.1%
171 | 1 | < 0.1%
170 | 2 | < 0.1%
169 | 1 | < 0.1%
168 | 2 | < 0.1%
167 | 2 | < 0.1%
166 | 2 | < 0.1%

ap_hi
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 97
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 126.5520854
Minimum: 80
Maximum: 200
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 428.9 KiB

Quantile statistics

Minimum: 80
5-th percentile: 100
Q1: 120
Median: 120
Q3: 140
95-th percentile: 160
Maximum: 200
Range: 120
Interquartile range (IQR): 20

Descriptive statistics

Standard deviation: 16.53810059
Coefficient of variation (CV): 0.1306821657
Kurtosis: 1.317943987
Mean: 126.5520854
Median Absolute Deviation (MAD): 10
Skewness: 0.833832394
Sum: 6945305
Variance: 273.5087712
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
120 | 22142 | 40.3%
140 | 7462 | 13.6%
130 | 7102 | 12.9%
110 | 6870 | 12.5%
150 | 3350 | 6.1%
160 | 2214 | 4.0%
100 | 2073 | 3.8%
90 | 785 | 1.4%
170 | 521 | 0.9%
180 | 490 | 0.9%
Other values (87) | 1872 | 3.4%

Smallest values

Value | Count | Frequency (%)
80 | 76 | 0.1%
85 | 7 | < 0.1%
90 | 785 | 1.4%
93 | 1 | < 0.1%
95 | 28 | 0.1%
96 | 2 | < 0.1%
97 | 1 | < 0.1%
99 | 2 | < 0.1%
100 | 2073 | 3.8%
101 | 4 | < 0.1%

Largest values

Value | Count | Frequency (%)
200 | 77 | 0.1%
197 | 1 | < 0.1%
195 | 1 | < 0.1%
193 | 2 | < 0.1%
191 | 2 | < 0.1%
190 | 91 | 0.2%
188 | 1 | < 0.1%
185 | 7 | < 0.1%
180 | 490 | 0.9%
179 | 4 | < 0.1%

ap_lo
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 75
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 81.32477542
Minimum: 50
Maximum: 140
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 428.9 KiB

Quantile statistics

Minimum: 50
5-th percentile: 70
Q1: 80
Median: 80
Q3: 90
95-th percentile: 100
Maximum: 140
Range: 90
Interquartile range (IQR): 10

Descriptive statistics

Standard deviation: 9.444009087
Coefficient of variation (CV): 0.1161270847
Kurtosis: 1.794702921
Mean: 81.32477542
Median Absolute Deviation (MAD): 0
Skewness: 0.3894453676
Sum: 4463185
Variance: 89.18930763
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
80 | 27815 | 50.7%
90 | 11381 | 20.7%
70 | 8134 | 14.8%
100 | 3245 | 5.9%
60 | 2177 | 4.0%
110 | 297 | 0.5%
79 | 295 | 0.5%
85 | 235 | 0.4%
75 | 177 | 0.3%
120 | 148 | 0.3%
Other values (65) | 977 | 1.8%

Smallest values

Value | Count | Frequency (%)
50 | 43 | 0.1%
52 | 2 | < 0.1%
53 | 3 | < 0.1%
54 | 1 | < 0.1%
55 | 4 | < 0.1%
56 | 1 | < 0.1%
57 | 3 | < 0.1%
58 | 4 | < 0.1%
59 | 15 | < 0.1%
60 | 2177 | 4.0%

Largest values

Value | Count | Frequency (%)
140 | 21 | < 0.1%
135 | 1 | < 0.1%
130 | 24 | < 0.1%
126 | 1 | < 0.1%
125 | 2 | < 0.1%
122 | 1 | < 0.1%
121 | 1 | < 0.1%
120 | 148 | 0.3%
119 | 2 | < 0.1%
118 | 1 | < 0.1%

cholesterol
Categorical

HIGH CORRELATION

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 428.9 KiB
1: 41169
2: 7455
3: 6257

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 54881
Distinct characters: 3
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value | Count | Frequency (%)
1 | 41169 | 75.0%
2 | 7455 | 13.6%
3 | 6257 | 11.4%

Length

Histogram of lengths of the category

Category Frequency Plot

Value | Count | Frequency (%)
1 | 41169 | 75.0%
2 | 7455 | 13.6%
3 | 6257 | 11.4%

Most occurring characters

Value | Count | Frequency (%)
1 | 41169 | 75.0%
2 | 7455 | 13.6%
3 | 6257 | 11.4%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 54881 | 100.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
1 | 41169 | 75.0%
2 | 7455 | 13.6%
3 | 6257 | 11.4%

Most occurring scripts

Value | Count | Frequency (%)
Common | 54881 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
1 | 41169 | 75.0%
2 | 7455 | 13.6%
3 | 6257 | 11.4%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 54881 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
1 | 41169 | 75.0%
2 | 7455 | 13.6%
3 | 6257 | 11.4%

gluc
Categorical

HIGH CORRELATION

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 428.9 KiB
1: 46704
3: 4116
2: 4061

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 54881
Distinct characters: 3
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value | Count | Frequency (%)
1 | 46704 | 85.1%
3 | 4116 | 7.5%
2 | 4061 | 7.4%

Length

Histogram of lengths of the category

Category Frequency Plot

Value | Count | Frequency (%)
1 | 46704 | 85.1%
3 | 4116 | 7.5%
2 | 4061 | 7.4%

Most occurring characters

Value | Count | Frequency (%)
1 | 46704 | 85.1%
3 | 4116 | 7.5%
2 | 4061 | 7.4%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 54881 | 100.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
1 | 46704 | 85.1%
3 | 4116 | 7.5%
2 | 4061 | 7.4%

Most occurring scripts

Value | Count | Frequency (%)
Common | 54881 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
1 | 46704 | 85.1%
3 | 4116 | 7.5%
2 | 4061 | 7.4%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 54881 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
1 | 46704 | 85.1%
3 | 4116 | 7.5%
2 | 4061 | 7.4%

smoke
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 428.9 KiB
0: 50014
1: 4867

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 54881
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 50014 | 91.1%
1 | 4867 | 8.9%

Length

Histogram of lengths of the category

Category Frequency Plot

Value | Count | Frequency (%)
0 | 50014 | 91.1%
1 | 4867 | 8.9%

Most occurring characters

Value | Count | Frequency (%)
0 | 50014 | 91.1%
1 | 4867 | 8.9%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 54881 | 100.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 50014 | 91.1%
1 | 4867 | 8.9%

Most occurring scripts

Value | Count | Frequency (%)
Common | 54881 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 50014 | 91.1%
1 | 4867 | 8.9%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 54881 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 50014 | 91.1%
1 | 4867 | 8.9%

alco
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 428.9 KiB
0: 51885
1: 2996

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 54881
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 51885 | 94.5%
1 | 2996 | 5.5%

Length

Histogram of lengths of the category

Category Frequency Plot

Value | Count | Frequency (%)
0 | 51885 | 94.5%
1 | 2996 | 5.5%

Most occurring characters

Value | Count | Frequency (%)
0 | 51885 | 94.5%
1 | 2996 | 5.5%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 54881 | 100.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 51885 | 94.5%
1 | 2996 | 5.5%

Most occurring scripts

Value | Count | Frequency (%)
Common | 54881 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 51885 | 94.5%
1 | 2996 | 5.5%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 54881 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 51885 | 94.5%
1 | 2996 | 5.5%

active
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 428.9 KiB
1: 44093
0: 10788

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 54881
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 0
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value | Count | Frequency (%)
1 | 44093 | 80.3%
0 | 10788 | 19.7%

Length

Histogram of lengths of the category

Category Frequency Plot

Value | Count | Frequency (%)
1 | 44093 | 80.3%
0 | 10788 | 19.7%

Most occurring characters

Value | Count | Frequency (%)
1 | 44093 | 80.3%
0 | 10788 | 19.7%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 54881 | 100.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
1 | 44093 | 80.3%
0 | 10788 | 19.7%

Most occurring scripts

Value | Count | Frequency (%)
Common | 54881 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
1 | 44093 | 80.3%
0 | 10788 | 19.7%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 54881 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
1 | 44093 | 80.3%
0 | 10788 | 19.7%

bp_class
Categorical

HIGH CORRELATION

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 428.9 KiB
2: 31832
3: 12802
0: 7595
1: 2510
4: 142

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 54881
Distinct characters: 5
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2
2nd row: 3
3rd row: 3
4th row: 2
5th row: 2

Common Values

Value | Count | Frequency (%)
2 | 31832 | 58.0%
3 | 12802 | 23.3%
0 | 7595 | 13.8%
1 | 2510 | 4.6%
4 | 142 | 0.3%

Length

Histogram of lengths of the category

Category Frequency Plot

Value | Count | Frequency (%)
2 | 31832 | 58.0%
3 | 12802 | 23.3%
0 | 7595 | 13.8%
1 | 2510 | 4.6%
4 | 142 | 0.3%

Most occurring characters

Value | Count | Frequency (%)
2 | 31832 | 58.0%
3 | 12802 | 23.3%
0 | 7595 | 13.8%
1 | 2510 | 4.6%
4 | 142 | 0.3%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 54881 | 100.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
2 | 31832 | 58.0%
3 | 12802 | 23.3%
0 | 7595 | 13.8%
1 | 2510 | 4.6%
4 | 142 | 0.3%

Most occurring scripts

Value | Count | Frequency (%)
Common | 54881 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
2 | 31832 | 58.0%
3 | 12802 | 23.3%
0 | 7595 | 13.8%
1 | 2510 | 4.6%
4 | 142 | 0.3%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 54881 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
2 | 31832 | 58.0%
3 | 12802 | 23.3%
0 | 7595 | 13.8%
1 | 2510 | 4.6%
4 | 142 | 0.3%

bmi
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 2093
Distinct (%): 3.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 23.68094386
Minimum: 12.14
Maximum: 80.24
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 428.9 KiB

Quantile statistics

Minimum: 12.14
5-th percentile: 18.04
Q1: 20.71
Median: 22.9
Q3: 26.03
95-th percentile: 31.78
Maximum: 80.24
Range: 68.1
Interquartile range (IQR): 5.32

Descriptive statistics

Standard deviation: 4.385587751
Coefficient of variation (CV): 0.1851948038
Kurtosis: 4.273553907
Mean: 23.68094386
Median Absolute Deviation (MAD): 2.55
Skewness: 1.257398333
Sum: 1299633.88
Variance: 19.23337993
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
20.71 | 748 | 1.4%
19.79 | 522 | 1.0%
21.55 | 503 | 0.9%
22.3 | 404 | 0.7%
21.85 | 370 | 0.7%
22.16 | 354 | 0.6%
19.12 | 341 | 0.6%
23.09 | 340 | 0.6%
22.78 | 332 | 0.6%
20.07 | 324 | 0.6%
Other values (2083) | 50643 | 92.3%

Smallest values

Value | Count | Frequency (%)
12.14 | 1 | < 0.1%
12.75 | 1 | < 0.1%
12.88 | 2 | < 0.1%
13.02 | 1 | < 0.1%
13.1 | 1 | < 0.1%
13.16 | 1 | < 0.1%
13.26 | 1 | < 0.1%
13.29 | 3 | < 0.1%
13.38 | 3 | < 0.1%
13.47 | 3 | < 0.1%

Largest values

Value | Count | Frequency (%)
80.24 | 1 | < 0.1%
77.33 | 1 | < 0.1%
76.84 | 1 | < 0.1%
70.21 | 1 | < 0.1%
69.11 | 1 | < 0.1%
62.29 | 1 | < 0.1%
57.43 | 1 | < 0.1%
56.72 | 2 | < 0.1%
55.71 | 1 | < 0.1%
55.67 | 1 | < 0.1%

bmi_class
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct: 6
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.353073013
Minimum: 0
Maximum: 5
Zeros: 3863
Zeros (%): 7.0%
Negative: 0
Negative (%): 0.0%
Memory size: 428.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 1
Median: 1
Q3: 2
95-th percentile: 3
Maximum: 5
Range: 5
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 0.7988181895
Coefficient of variation (CV): 0.5903733074
Kurtosis: 2.510469159
Mean: 1.353073013
Median Absolute Deviation (MAD): 0
Skewness: 1.237117311
Sum: 74258
Variance: 0.6381104999
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=6)
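
Common values

Value | Count | Frequency (%)
1 | 33657 | 61.3%
2 | 12738 | 23.2%
0 | 3863 | 7.0%
3 | 3611 | 6.6%
4 | 768 | 1.4%
5 | 244 | 0.4%

Smallest values

Value | Count | Frequency (%)
0 | 3863 | 7.0%
1 | 33657 | 61.3%
2 | 12738 | 23.2%
3 | 3611 | 6.6%
4 | 768 | 1.4%
5 | 244 | 0.4%

Largest values

Value | Count | Frequency (%)
5 | 244 | 0.4%
4 | 768 | 1.4%
3 | 3611 | 6.6%
2 | 12738 | 23.2%
1 | 33657 | 61.3%
0 | 3863 | 7.0%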

cardio
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 428.9 KiB
0: 27738
1: 27143

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 54881
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 27738 | 50.5%
1 | 27143 | 49.5%

Length

Histogram of lengths of the category

Category Frequency Plot

Value | Count | Frequency (%)
0 | 27738 | 50.5%
1 | 27143 | 49.5%

Most occurring characters

Value | Count | Frequency (%)
0 | 27738 | 50.5%
1 | 27143 | 49.5%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 54881 | 100.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 27738 | 50.5%
1 | 27143 | 49.5%

Most occurring scripts

Value | Count | Frequency (%)
Common | 54881 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 27738 | 50.5%
1 | 27143 | 49.5%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 54881 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 27738 | 50.5%
1 | 27143 | 49.5%

Interactions

[81 interaction plots (Matplotlib SVG images); not reproduced in this text version.]

Correlations


Auto

The auto setting is an easily interpretable pairwise column metric that picks a method by variable-type pair: categorical-categorical uses Cramér's V, numerical-categorical uses Cramér's V (on a discretized numerical column), and numerical-numerical uses Spearman's ρ. This configuration uses the most suitable method for each pair of columns.
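
The mapping above can be sketched as a small lookup. This is only an illustration of the rule as stated in this report; the function name `auto_method` and the string labels are hypothetical and are not taken from the profiling library.

```python
def auto_method(kind_a, kind_b):
    """Pick a correlation method name for a pair of column kinds.

    Hypothetical helper: mirrors the mapping described in the text,
    where each kind is either "numerical" or "categorical".
    """
    kinds = {kind_a, kind_b}
    if not kinds <= {"numerical", "categorical"}:
        raise ValueError(f"unsupported column kinds: {kind_a!r}, {kind_b!r}")
    if kinds == {"numerical"}:
        # Both columns numerical: rank-based correlation.
        return "spearman"
    # At least one categorical column: Cramér's V
    # (a numerical partner would be discretized first).
    return "cramers_v"


print(auto_method("numerical", "numerical"))
print(auto_method("numerical", "categorical"))
```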

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
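
As a minimal sketch of that calculation (not the code used to produce this report), the snippet below ranks both variables, averaging tied positions, and then divides the covariance of the ranks by the product of their standard deviations. The helper names and the sample values are illustrative assumptions.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman's rho: Pearson's r applied to the rank variables."""
    return pearson(average_ranks(x), average_ranks(y))


# Made-up data: y grows monotonically but not linearly with x.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]
print("Spearman:", spearman(x, y))
print("Pearson: ", pearson(x, y))
```

On this monotonic but nonlinear example the rank-based coefficient reaches 1 (up to floating-point error), while Pearson's r stays below 1.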

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
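
The following sketch computes r directly from that definition and checks the invariance property on made-up numbers. The function name `pearson_r` and the sample values are assumptions for illustration, not data from this report.

```python
import math


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


# Hypothetical measurements, chosen only for the demonstration.
x = [150.0, 160.0, 165.0, 172.0, 180.0]
y = [52.0, 61.0, 58.0, 75.0, 80.0]

r = pearson_r(x, y)

# Rescale x (e.g. centimetres to metres) and shift/scale y by positive factors.
x_scaled = [v / 100.0 for v in x]
y_shifted = [2.0 * v + 5.0 for v in y]
r_transformed = pearson_r(x_scaled, y_shifted)

print(r, r_transformed)  # the two values agree up to floating-point error
```

Only positive rescaling leaves r unchanged; multiplying one variable by a negative factor flips the sign of r.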

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
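
A minimal sketch of that pair-counting definition follows. It implements the plain form described here, often called tau-a, which is an assumption about the variant: statistical libraries commonly default to tau-b, which adjusts the denominator for ties, so results can differ when the data contain ties. The function name is illustrative.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs, with no tie correction."""
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1   # both variables move in the same direction
        elif sign < 0:
            discordant += 1   # the variables move in opposite directions
        # sign == 0 means a tie in x or y; it counts toward neither group
    n = len(x)
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs


# Made-up example: one swapped pair out of six gives (5 - 1) / 6.
print(kendall_tau_a([1, 2, 3, 4], [1, 3, 2, 4]))
```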

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use a bias-corrected measure proposed by Bergsma in 2013.
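
The sketch below shows one way to compute a bias-corrected Cramér's V from a contingency table, using the correction commonly attributed to Bergsma (2013). It is an illustration under stated assumptions, not the code used by the profiling software: the function name is hypothetical, the table is a plain list of rows, and every row and column is assumed to have a non-zero total.

```python
import math


def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for a contingency table given as a list of rows.

    Assumes every row and every column has a non-zero total.
    """
    n = sum(sum(row) for row in table)
    r, k = len(table), len(table[0])
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(r)) for j in range(k)]

    # Pearson chi-squared statistic (no continuity correction).
    chi2 = 0.0
    for i in range(r):
        for j in range(k):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected

    phi2 = chi2 / n
    # Subtract the expected value of phi2 under independence, floored at zero.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    # Shrink the table dimensions in the same spirit.
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    return math.sqrt(phi2_corr / min(k_corr - 1, r_corr - 1))


# Made-up tables: proportional rows (independence) and a diagonal table.
print(cramers_v_corrected([[10, 20], [20, 40]]))
print(cramers_v_corrected([[50, 0], [0, 50]]))
```

The first table has proportional rows, so the corrected coefficient is 0; the second is perfectly associated, so it comes out at 1 up to floating-point error.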

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. There is extensive documentation available here.

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

df_index  id  age  gender  height  weight  ap_hi  ap_lo  cholesterol  gluc  smoke  alco  active  bp_class  bmi  bmi_class  cardio
0473396761759115480.01309021001227.5121
1674569632045216270.01409011000322.7711
2123081757157217492.015010011001327.5421
3325574649264217376.01208211001222.9111
466494555116060.01208011001219.7910
52100329995591156120.01208032001240.7051
6296374237350217987.01108011001225.1620
7579378268563116879.01509011001324.6511
82368033853552180103.01208031001229.5821
9263063757663216670.01208011001222.1511

Last rows

df_index  id  age  gender  height  weight  ap_hi  ap_lo  cholesterol  gluc  smoke  alco  active  bp_class  bmi  bmi_class  cardio
54871672219597640115054.01106011001019.0910
54872410905871940114647.01108011001217.0800
54873160232288340116963.01208011001219.5310
54874602638603950116283.01309031001227.0020
54875441316304753116961.01709022000318.9111
54876371945313743217075.01508011101223.0811
548776265891863216273.01609011001323.7511
54878548867830264116974.01208011001222.9310
54879860119749116770.01208011000222.0010
54880157952255641217764.01208011001218.7710